96 research outputs found

Statistical Learning of Arbitrary Computable Classifiers

Author: Soloveichik David
Publication date: 22/06/2008

Statistical learning theory chiefly studies restricted hypothesis classes, particularly those with finite Vapnik-Chervonenkis (VC) dimension. The fundamental quantity of interest is the sample complexity: the number of samples required to learn to a specified level of accuracy. Here we consider learning over the set of all computable labeling functions. Since the VC-dimension is infinite and a priori (uniform) bounds on the number of samples are impossible, we let the learning algorithm decide when it has seen sufficient samples to have learned. We first show that learning in this setting is indeed possible, and develop a learning algorithm. We then show, however, that bounding sample complexity independently of the distribution is impossible. Notably, this impossibility is entirely due to the requirement that the learning algorithm be computable, and not due to the statistical nature of the problem.

Comment: Expanded the section on prior work and added reference

arXiv.org e-Print Archive

Caltech Authors

Robust Stochastic Chemical Reaction Networks and Bounded Tau-Leaping

Author: Soloveichik David
Publication venue: Mary Ann Liebert, Inc.
Publication date: 05/03/2009

The behavior of some stochastic chemical reaction networks is largely unaffected by slight inaccuracies in reaction rates. We formalize the robustness of state probabilities to reaction rate deviations, and describe a formal connection between robustness and efficiency of simulation. Without robustness guarantees, stochastic simulation seems to require computational time proportional to the total number of reaction events. Even if the concentration (molecular count per volume) stays bounded, the number of reaction events can be linear in the duration of simulated time and total molecular count. We show that the behavior of robust systems can be predicted such that the computational work scales linearly with the duration of simulated time and concentration, and only polylogarithmically in the total molecular count. Thus our asymptotic analysis captures the dramatic speedup when molecular counts are large, and shows that for bounded concentrations the computation time is essentially invariant with molecular count. Finally, by noticing that even robust stochastic chemical reaction networks are capable of embedding complex computational problems, we argue that the linear dependence on simulated time and concentration is likely optimal.
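
The claim that exact stochastic simulation costs time proportional to the number of reaction events can be illustrated with a minimal sketch of the standard exact (Gillespie-style) simulation loop. This is not the bounded tau-leaping method of the paper; the two-species network A <-> B, the function name `gillespie_event_count`, and the rate parameters are all assumptions chosen for illustration only.

```python
import random


def gillespie_event_count(total, k_fwd, k_rev, t_end, seed=None):
    """Exactly simulate the hypothetical network A <-> B and count events.

    total  -- initial number of A molecules (B starts at 0)
    k_fwd  -- per-molecule rate of A -> B
    k_rev  -- per-molecule rate of B -> A
    t_end  -- duration of simulated time
    Returns the number of reaction events fired before t_end.
    """
    rng = random.Random(seed)
    a, b = total, 0
    t = 0.0
    events = 0
    while True:
        rate_fwd = k_fwd * a
        rate_rev = k_rev * b
        rate_total = rate_fwd + rate_rev
        if rate_total == 0:
            break  # no reaction can fire
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(rate_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its rate.
        if rng.random() * rate_total < rate_fwd:
            a, b = a - 1, b + 1
        else:
            a, b = a + 1, b - 1
        events += 1  # one loop iteration per reaction event
    return events


if __name__ == "__main__":
    for total in (10, 100, 1000):
        print(total, gillespie_event_count(total, 1.0, 1.0, 5.0, seed=0))
```

With equal rates the total event rate in this toy network is `k * total`, so the loop runs roughly `total * t_end` times: the work of the exact simulation grows linearly in both simulated time and molecular count, which is the cost the abstract contrasts with the polylogarithmic dependence achievable for robust systems.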

CiteSeerX

Caltech Authors

Stable Leader Election in Population Protocols Requires Linear Time

Authors: Doty David, Soloveichik David
Publication date: 05/10/2015

A population protocol *stably elects a leader* if, for all $n$, starting from an initial configuration with $n$ agents each in an identical state, with probability 1 it reaches a configuration $\mathbf{y}$ that is correct (exactly one agent is in a special leader state $\ell$) and stable (every configuration reachable from $\mathbf{y}$ also has a single agent in state $\ell$). We show that any population protocol that stably elects a leader requires $\Omega(n)$ expected "parallel time" (that is, $\Omega(n^2)$ expected total pairwise interactions) to reach such a stable configuration. Our result also informs the understanding of the time complexity of chemical self-organization by showing an essential difficulty in generating exact quantities of molecular species quickly.

Comment: accepted to Distributed Computing special issue of invited papers from DISC 2015; significantly revised proof structure and intuitive explanation
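
To make the relationship between total interactions and "parallel time" concrete, here is a minimal simulation sketch. It assumes the textbook pairwise-elimination protocol (every agent starts as a leader; when two leaders meet, one becomes a follower), which is not described in the abstract and is used here only as an illustrative example. The function name `simulate_leader_election` and the convention that parallel time equals interactions divided by $n$ are assumptions of this sketch.

```python
import random


def simulate_leader_election(n, seed=None):
    """Simulate the hypothetical protocol L + L -> L + F on n agents.

    All agents start in the leader state. At each step a uniformly random
    ordered pair of distinct agents interacts; if both are leaders, the
    second one becomes a follower.
    Returns (total_interactions, parallel_time), with
    parallel_time = total_interactions / n.
    """
    rng = random.Random(seed)
    is_leader = [True] * n
    leaders = n
    interactions = 0
    while leaders > 1:
        i, j = rng.sample(range(n), 2)  # two distinct agents
        interactions += 1
        if is_leader[i] and is_leader[j]:
            is_leader[j] = False
            leaders -= 1
    return interactions, interactions / n


if __name__ == "__main__":
    for n in (10, 50, 200):
        runs = [simulate_leader_election(n, seed=s)[1] for s in range(20)]
        print(n, sum(runs) / len(runs))  # average parallel time over 20 runs
```

For this particular protocol the expected number of interactions works out to $(n-1)^2$, since with $k$ leaders remaining a random pair consists of two leaders with probability $k(k-1)/(n(n-1))$ and $\sum_{k=2}^{n} 1/(k(k-1)) = 1 - 1/n$. Its expected parallel time is therefore about $n$, in line with the $\Omega(n)$ lower bound stated in the abstract.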

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

Hal-Diderot

Hardness of Computing and Approximating Predicates and Functions with Leaderless Population Protocols

Authors: Belleville Amanda, Doty David, Soloveichik David
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)
Publication date: 01/01/2017

Population protocols are a distributed computing model appropriate for describing massive numbers of agents with very limited computational power (finite automata in this paper), such as sensor networks or programmable chemical reaction networks in synthetic biology. A population protocol is said to require a leader if every valid initial configuration contains a single agent in a special "leader" state that helps to coordinate the computation. Although the class of predicates and functions computable with probability 1 (stable computation) is the same whether a leader is required or not (semilinear functions and predicates), it is not known whether a leader is necessary for fast computation. Due to the large number of agents n (synthetic molecular systems routinely have trillions of molecules), efficient population protocols are generally defined as those computing in polylogarithmic in n (parallel) time. We consider population protocols that start in leaderless initial configurations, and the computation is regarded finished when the population protocol reaches a configuration from which a different output is no longer reachable. In this setting we show that a wide class of functions and predicates computable by population protocols are not efficiently computable (they require at least linear time), nor are some linear functions even efficiently approximable. It requires at least linear time for a population protocol even to approximate division by a constant or subtraction (or any linear function with a coefficient outside of N), in the sense that for sufficiently small gamma > 0, the output of a sublinear time protocol can stabilize outside the interval f(m) (1 +/- gamma) on infinitely many inputs m. In a complementary positive result, we show that with a sufficiently large value of gamma, a population protocol can approximate any linear f with nonnegative rational coefficients, within approximation factor gamma, in O(log n) time. 
We also show that it requires linear time to exactly compute a wide range of semilinear functions (e.g., f(m)=m if m is even and 2m if m is odd) and predicates (e.g., parity, equality).

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Complexity of Self-assembled Shapes (Extended Abstract)

Authors: Soloveichik David, Winfree Erik
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2005

The connection between self-assembly and computation suggests that a shape can be considered the output of a self-assembly “program,” a set of tiles that fit together to create a shape. It seems plausible that the size of the smallest self-assembly program that builds a shape and the shape’s descriptional (Kolmogorov) complexity should be related. We show that under the notion of a shape that is independent of scale this is indeed so: in the Tile Assembly Model, the minimal number of distinct tile types necessary to self-assemble an arbitrarily scaled shape can be bounded both above and below in terms of the shape’s Kolmogorov complexity. As part of the proof of the main result, we sketch a general method for converting a program outputting a shape as a list of locations into a set of tile types that self-assembles into a scaled up version of that shape. Our result implies, somewhat counter-intuitively, that self-assembly of a scaled up version of a shape often requires fewer tile types, and suggests that the independence of scale in self-assembly theory plays the same crucial role as the independence of running time in the theory of computability.